Prediction of Kinetic Product Ratios: Investigation of a Dynamically Controlled Case

Of the various factors influencing kinetically controlled product ratios, the role of nonstatistical dynamics is arguably the least well understood. In this paper, reactions were chosen in which dynamics played a dominant role in product selection, by design. Specifically, the reactions studied were the ring openings of cyclopropylidene to allene and tetramethylcyclopropylidene to tetramethylallene (2,4-dimethylpenta-2,3-diene). Both reactions have intrinsic reaction coordinates that bifurcate symmetrically, leading to products that are enantiomeric once the atoms are uniquely labeled. The question addressed in the study was whether the outcomes—that is, which product well on the potential energy surface was selected—could be predicted from their initial conditions for individual trajectories in quasiclassical dynamics simulations. Hybrid potentials were developed based on cooperative interaction between molecular mechanics and artificial neural networks, trained against data from electronic structure calculations. These potentials allowed simulations of both gas-phase and condensed-phase reactions. The outcome was that, for both reactions, prediction of initial selection of product wells could be made with >95% success from initial conditions of the trajectories in the gas phase. However, when trajectories were run for longer, looking for “final” products for each trajectory, the predictability dropped off dramatically. In the gas-phase simulations, this drop off was caused by trajectories hopping between product wells on the potential energy surface. That behavior could be suppressed in condensed phases, but then new uncertainty was introduced because the intermolecular interactions between solute and bath, necessary to permit intermolecular energy transfer and cooling of the hot initial products, often led to perturbations of the initial directions of trajectories on the potential energy surface. It would consequently appear that a general ability to predict outcomes for reactions in which nonstatistical dynamics dominate remains a challenge even in the age of sophisticated machine-learning capabilities.


S2. Proof of the Translational and Rotational Invariance of the Chiral Metric
In the main text, the assertion is made that the determinant  is translationally and rotationally invariant. The proof of translational invariance is straightforward. It is a general property of determinants that adding one column, multiplied by an arbitrary scalar, to another leaves its value unchanged. Because the fourth column of  consists of 1s, that means that adding an arbitrary scalar to each element of any column leaves its value unchanged. However, such an addition corresponds to translating the structure along the axis specified by the chosen column.
Proof of rotational invariance is less straightforward. It was accomplished by using the Mathematica symbolic algebra program. 3 The steps are shown below.
Step 1, define a 3x4 matrix, q, for the Cartesian Coordinates of four atoms: Step 2, define a 3D rotation matrix for rotation by angle θ about axis (x 0 ,y 0 ,z 0 ): Step 3, premultiply the Cartesian coordinate matrix by the rotation matrix: Step 4, assemble the determinant χ from the transpose of rq plus a column of 1s.

S6
As described in section 3.1.1 of the main text, one of the methods tested for prediction of initially selected products from trajectory initial conditions involved the generation of a linear-synchronous-transit (LST) approximation to the IRC. For convenience, Fig. 6 from the main text is reproduced here as Fig. S1.
In reality, calculation of the LSQ-LST was carried out in the full-dimensional, mass-weighted Cartesian coordinates, rather than the 2D projection of Fig. SX1. This was accomplished as follows. Point i of the LSQ-LST was represented as: Here IRC 0 is the structure of saddle point SP1 in mass-weighted Cartesian coordinates and expressed as a 1D vector. The vector c consists of coefficients whose values are determined by minimizing the function: Where index i specifies which point of the IRC one is considering and index j specifies which coordinate of that point one is considering. Consequently, M is the number of points in the IRC, and N is the number of atoms in the molecule.
The 2D representation in Fig. S1 indicates that the distance between the true IRC and its LST approximation must be zero by construction at the geometry of SP1, but there should be a second point where the two cross. That turns out to be very nearly true for the fulldimensional analogs as well, as indicated in Fig. S2, which depicts the squared distance between the true IRC and its LST approximation for each point of the IRC. Figure S2. Squared distance (in mass-weighted coordinates) between the true IRC for cyclopropylidene ring opening and its optimum LST approximation. The red star indicates the reaction coordinate value at which the two almost cross.
The geometry corresponding to the near crossing point is depicted in Fig. S3, along with its pseudo-enantiomer, which would lead to the opposite product enantiomer. S7  Figure S3. Geometry at which the optimum LST and true IRC almost cross (upper right). The structure without atom labeling is chiral. Its enantiomer becomes a pseudo-enantiomer once atom labels are introduced -i.e., with labels included, the two structures are actually diastereomeric.
In principle, there could be eight total paths from the SP1 saddle point to the two enantiomeric allenes, when atom labels are considered, even with the same atom connectivities maintained. However, trajectories were initiated by sampling around one particular diastereomer of the labeled SP1, and the sampling procedure did not allow generation of the other diastereomeric structures, and so only one LST need be generated for each product enantiomer.
With the LSQ-LSTs established, the procedure for making a product prediction from initial conditions of a given trajectory was as follows. The starting structure was referred to its center of mass as origin, and then rotated in mass-weighted coordinates for maximum overlap with the orientation of SP1 used to generate the LSQ-LSTs. This was accomplished by using the Kabsch algorithm. 4 The rotation angles required to achieve maximum overlap were recorded and then applied to the mass-weighted Cartesian velocities. The resulting velocities were flattened into a 1D vector, which was normalized. Finally, the dot product of this vector was taken with the unit vectors pointing along the directions of the two LSQ_LSTs. The larger value of the dot product indicated the preferred product.

S4. Details of the Artificial Neural Networks Used in the Calculation of Energies and Derivatives for Cyclopropylidene and TMCP
The MM-ANN method described in the main text made use of feedforward artificial neural networks (ANNs), the details of which came directly from the work Artrith, Urban, and Ceder (AUC). 5 The incorporation of the ANNs into the TINKER molecular mechanics package was based on the work of Chen et al., 6 with differences noted below.
A feedforward ANN can be represented by the recurrence equation S4: where  l is the vector of values assigned to the neurons in the l th layer of the network. On the right-hand side of S4, f a l is the activation function for layer l, while w l and b l are vectors of weights and biases for the layer, whose values will be optimized during the training of the network. The values of  -1 are all set to unity, so that the input layer of the ANN,  0 , does not depend on a prior layer. The input layer accepts structural information about the molecule for which energies and derivatives are required. However, the ANN model must return values that are independent of translation and rotation of the molecule, and also independent of permutations among chemically equivalent atoms. These requirements rule out conventional Cartesian or internal coordinates as inputs to the ANN. The AUC solution to this problem draws on an idea originally proposed by Behler and Parrinello, which is to define different atom types within a molecule and to train separate ANNs for each type. The overall energy is then simply the sum of the atomic contributions.
This approach can be recognized to be similar in philosophy to Benson's group-additivity method for calculation of heats of formation for molecules. 7 Benson tabulated a large number of enthalpy contributions for various atom groups. A group would be defined by a central atom plus all the atoms to which it was directly connected. The heat of formation for a molecule could then be calculated by identifying all its groups and then simply adding up the contributions from each. Often, some additional correction factors would be required to capture features such as ring strain, which could not be embedded in the group contributions, but with these corrections included, final heats of formation within ±1 kcal mol of the experimental value were typical for "normal" organic molecules. 7 The AUC method expresses the environment of a particular atom type in terms of radial distribution functions (RDFs), which capture pairwise interactions between atoms and angular distribution functions (ADFs), which capture three-body interactions. The RDFs and ADFs are themselves expanded in a basis set of Chebyshev polynomials: In eq. S5, t specifies the list of atom types, and t i is the i th member of that list. T  is the Chebyshev polynomial of order . The limit  max is the highest order of polynomial at which the expansion is truncated. N is the total number of atoms in the structure, r jk is the distance between atoms j and k, R c is the cutoff distance chosen for the expansion, and f c (r jk ) is a cutoff function, defined in eq. S7. Chebyshev polynomials are defined by the recurrence relationship in eq. S6, with T 0 (x) = 1 and T 1 (x) = x.
The ADFs are defined by eq. S8: where  kjl is the angle made by atoms k and l at atom j.
In the original work, training of the ANNs involved minimizing the cost function C (eq. S9): is the total energy of the i th structure returned by all the relevant atomic ( , ) ANNs, meaning that w and b in eq. S9 are vectors of weights and biases that encompass all layers of all atomic ANNs. N data is the total number of structures for which one has calculated energies and derivatives using the chosen electronic structure method.
is the reference quantum-mechanical energy for the i th structure. In the present work, the original cost function C was replaced by the new cost function F, defined in eq. S10. (S10) Here, is the energy of the i th structure at the MM3 molecular-mechanics level. The

3
ANNs were thus trained to provide corrections to the MM3 energies and gradients. Minimization of F with respect to w and b was carried out by the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm, as in the original work.
All the calculations implied by eqs. S5-S10 were carried out with the free and open-source packages aenet and aenet-tinker. 5 For the cyclopropylidene MM-ANN potential, three atom types were identified: C, Cq, and H, as illustrated below: Three atomic ANNs were thus required, called C.ann, Cq.ann, and H.ann, respectively. All three ANNs had the following parameters: For the TMCP MM-ANN potential, four atom types were identified, as illustrated below: The four atomic ANNs: Ca.ann, Cc.ann, Cm.ann, and Hm.ann each had the same parameters listed above for the cyclopropylidene ANNs. Each of the ANNs had all four atom types in its environment, except for Cc.ann, which only had atom types Ca, Cm, and Hm in its environment.
In order to carry out the trajectory calculations in the presence of bath atoms or molecules, it was necessary to create dummy ANNs for each of the bath atom types. Even though the intent was to evaluate bath energy terms and bath-solute interaction terms purely by MM3 molecular mechanics, the aenet-tinker code expected to find an ANN for every atom type present. The dummy ANNs were created by training against data sets in which bath atoms were randomly located within their periodic-boundary box, with each of the random configurations being given a correction energy near zero. The energies could not be set exactly to zero because an internal rescaling within the code would overflow if that was attempted, but energies of ±10 -7 kcal/mol were accepted.

S5. Details of the Boosted Tree Machine Learning Model for Predicting the Initially Accessed Minima of TMCP Ring-Opening Trajectories
As described in the main text, Cartesian coordinates and conjugate momenta of the initial structures were converted to internal-coordinate equivalents by computing the Wilson B matrix, whose elements are , where q is the vector of internal coordinates and x is the ∂ ∂ vector of Cartesian coordinates. This was accomplished in Python, using the autograd module (https://github.com/HIPS/autograd/blob/master/docs/tutorial.md, accessed 12/15/2022) to compute the co-ordinate derivatives. The trajectories were run for 150 fs, using the MM-ANN potential for TMCP. At the end of each trajectory, the result was classified as M, P, or R for, respectively, a product with a negative value of the chirality metric, a product with a positive value of the chirality metric or a recrossing trajectory that returned to the reactant. The initial